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FAST COMPUTATION OF POWER SERIES SOLUTIONS 
OF SYSTEMS OF DIFFERENTIAL EQUATIONS 

A. BOSTAN, F. CHYZAK, F. OLLIVIER, B. SALVY, E. SCHOST, AND A. SEDOGLAVIC 

Abstract. We propose new algorithms for the computation of the first A'^ terms of a vector (resp. 
a basis) of power series solutions of a linear system of differential equations at an ordinary point, 
using a number of arithmetic operations which is quasi-linear with respect to A^. Similar results 

5^ [ are also given in the non-linear case. This extends previous results obtained by Brent and Kung 

fvj ■ for scalar differential equations of order one and two. 

U ' 

Oh; 

1. Introduction 

in 

^N ■ In this article, we are interested in the computation of the first A'^ terms of power series solutions 

of differential equations. This problem arises in combinatorics, where the desired power series is a 
^^ I generating function, as well as in numerical analysis and in particular in control theory. 

Z/y • Let IK be a field. Given r + 1 formal power series ao(i), . . . , ar{t) in ]K[[t]], one of our aims is to 

c/3 , provide fast algorithms for solving the linear differential equation 

(1) arit)y^''\t) + ■■■ + aiit)y'{t) + ao(t)y(t) = 0. 

Specifically, under the hypothesis that t = is an ordinary point for Equation (1) (i.e., 0^(0) 7^ 0), 
we give efficient algorithms taking as input the first A^ terms of the power series ao(t), • • • ,arit) 
and answering the following algorithmic questions: 

i. find the first N coefficients of the r elements of a basis of power series solutions of (1); 
ii. given initial conditions ao, ■ ■ ■ , Or-i in IK, find the first N coefficients of the unique solu- 
V^ , tion y{t) in M.[[t]] of Equation (1) satisfying 

5: y{0) = ao, y'(0) = ai, ..., y^'-^\0) = ar-i. 

J-^ ' More generally, we also treat linear first-order systems of differential equations. From the data of 

initial conditions v in MrxrO^) (resp. Mrxi(^)) and of the first N coefficients of each entry of the 
k>( I matrices A and B in A^rxr(I^[[i]]) (resp. b in A^rxi(IK[[t]])), we propose algorithms that compute 

^ ■ the first N coefficients: 

I. of a fundamental solution Y in MrxrO^M) of Y' = AY + B, with Y{0) = v, det y(0) / 0; 
II. of the unique solution y{t) in A^rxi(I^[[i]]) of y' = Ay + b, satisfying y{0) = v. 
Obviously, if an algorithm of algebraic complexity C (i.e., using C arithmetic operations in K) is 
available for problem II, then applying it r times solves problem I in time r C, while applying it 
to a companion matrix solves problem ii in time C and problem i in r C. Conversely, an algorithm 
solving i (resp. I) also solves ii (resp. II) within the same complexity, plus that of a linear 
combination of series. Our reason for distinguishing the four problems i, ii, I, II is that in many 
cases, we are able to give algorithms of better complexity than obtained by these reductions. 

The most popular way of solving i, ii, I, and II is the method of undetermined coefficients that 
requires 0{r'^N'^) operations in K. for problem i and 0{rN'^) operations in IC for ii. Regarding the 
dependence in iV, this is certainly too expensive compared to the size of the output, which is only 
linear in N in both cases. On the other hand, verifying the correctness of the output for ii (resp. i) 
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already requires a number of operations in K which is hnear (resp. quadratic) in r: this indicates 
that there is httle hope of improving the dependence in r. Similarly, for problems I and II, the 
method of undetermined coefficients requires 0{N'^) multiplications of r x r scalar matrices (resp. 
of scalar matrix- vector products in size r), leading to a computational cost which is reasonable with 
respect to r, but not with respect to N . 

By contrast, the algorithms proposed in this article have costs that are linear (up to logarithmic 
factors) in the complexity M(A^) of polynomial multiplication in degree less than A^ over K. Using 
Fast Fourier Transform (FFT) these costs become nearly linear — up to polylogarithmic factors — 
with respect to N , for all of the four problems above (precise complexity results are stated below). 
Up to these polylogarithmic terms in N , this estimate is probably not far from the lower algebraic 
complexity one can expect: indeed, the mere check of the correctness of the output requires, in 
each case, a computational effort proportional to N . 

1.1. Newton Iteration. In the case of first-order equations (r = 1), Brent and Kung have shown 
in [8] (see also [15, 23]) that the problems can be solved with complexity 0{M{N)) by means of a 
formal Newton iteration. Their algorithm is based on the fact that solving the first-order differential 
equation y'{t) = a{t)y{t), with a{t) in IC[[t]] is equivalent to computing the power series exponen- 
tial exp( J a{t)). This equivalence is no longer true in the case of a system Y' = A{t)Y (where A{t) is 
a power series matrix): for non-commutativity reasons, the matrix exponential Y{t) = exp(f A{t)) 
is not a solution of Y' = A{t)Y. 

Brent and Kung suggest a way to extend their result to higher orders, and the corresponding 
algorithm has been shown by van der Hoeven in [40] to have complexity 0{r^ M(iV)). This is good 
with respect to A^, but the exponential dependence in the order r is unacceptable. 

Instead, we solve this problem by devising a specific Newton iteration for Y' = A{t)Y. Thus we 
solve problems i and I in C'(MM(r, N)), where IVlM(r, N) is the number of operations in K. required 
to multiply rxr matrices with polynomial entries of degree less than N. For instance, when K = Q, 
this is 0{r^N -\- r^M(iV)), where r^ can be seen as an abbreviation for MM(r, 1), see §1.5 below. 

1.2. Divide-and-conquer. The resolution of problems i and I by Newton iteration relies on the 
fact that a whole basis is computed. Dealing with problems ii and II, we do not know how to 
preserve this algorithmic structure, while simultaneously saving a factor r. 

To solve problems ii and II, we therefore propose an alternative algorithm, whose complexity is 
also nearly linear in A^ (but not quite as good, being in 0{M{N) log A^)), but whose dependence in 
the order r is better — linear for i and quadratic for ii. In a different model of computation with 
power series, based on the so-called relaxed multiplication, van der Hoeven briefiy outlines another 
algorithm [40, Section 4.5.2] solving problem ii in 0{rM{N)\ogN). To our knowledge, this result 
cannot be transferred to the usual model of power series multiplication (called zealous in [40]). 

We use a divide-and-conquer technique similar to that used in the fast Euclidean algorithm [22, 
35, 39]. For instance, to solve problem ii, our algorithm divides it into two similar problems of 
halved size. The key point is that the lowest coefficients of the solution y{t) only depend on the 
lowest coefficients of the coefficients a,. Our algorithm first computes the desired solution y{t) 
at precision only A^/2, then it recovers the remaining coefficients of y{t) by recursively solving at 
precision A^/2 a new differential equation. The main idea of this second algorithm is close to that 
used for solving first-order difference equations in [13]. 

We encapsulate our main complexity results in Theorem 1 below. When FFT is used, the 
functions M{N) and MM(r, A^) have, up to logarithmic terms, a nearly linear growth in A^, see §1.5. 
Thus, the results in the following theorem are quasi-optimal. 

Theorem 1. Let N and r be two positive integers and let M. be a field of characteristic zero or at 
least N . Then: 
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Problem 

(input, output) 


constant 
coefficients 


polynomial 
coefficients 


power series 
coefficients 


output 
size 


i (equation, basis) 


0(M(r)iV) * 


Oidr^N) 


0{MM{r,N)) 


0{rN) 


ii (equation, one solution) 


0{M{r)N/r) * 


0{drN) 


0{rM{N)logN) * 


0{N) 


I (system, basis) 


0{rM{r)N) * 


Oidr'^N) 


0{MM{r,N)) 


0{r^N) 


II (system, one solution) 


0(M(r)iV) * 


Oidr^N) 


0{r^ M {N) log N)* 


0{rN) 



Table 1. Complexity of solving linear differential equations/systems for N ^ r. 
Entries marked with a '*' correspond to new results. 



(a) problems i and I can be solved using O (MM(r, A^)) operations in K; 

(b) problem ii can be solved using O (r M(A^) log A^) operations in IK; 

(c) problem II can be solved using O (r^ M(A^)logA^) operations in K. 

1.3. Special Coefficients. For special classes of coefficients, we give different algorithms of better 
complexity. We isolate two important classes of equations: that with constant coefficients and that 
with polynomial coefficients. In the case of constant coefficients, our algorithms are based on the 
use of the Laplace transform, which allows us to reduce the resolution of differential equations 
with constant coefficients to manipulations with rational functions. The complexity results are 
summarized in the following theorem. 

Theorem 2. Let N and r be two positive integers and let "K be a field of characteristic zero or at 
least N . Then, for differential equations and systems with constant coefficients: 

(a) problem i can be solved using O (M(r) (r + A^)) operations in K; 

(b) problem ii can be solved using O (M(r) (1 + N/r)) operations in IK; 

(c) problem I can be solved using O [r^^^^ logr + rM(r)A^) operations in IK; 

(d) problem II can be solved using O (r'^logr + M{r)N) operations in IK. 

In the case of polynomial coefficients, we exploit the linear recurrence satisfied by the coefhcients 
of solutions. In Table 1, we gather the complexity estimates corresponding to the best known 
solutions for each of the four problems i, ii, I, and II in the general case, as well as in the above 
mentioned special cases. The algorithms are described in Section 4. In the polynomial coefficients 
case, these results are well known. In the other cases, to the best of our knowledge, the results 
improve upon existing algorithms. 



1.4. Non-linear Systems. As an important consequence of Theorem 1, we improve the known 
complexity results for the more general problem of solving non-linear systems of differential equa- 
tions. To do so, we use a classical reduction technique from the non-linear to the linear case, see for 
instance [34, Section 25] and [8, Section 5.2]. For simplicity, we only consider non-linear systems 
of first order. There is no loss of generality in doing so, more general cases can be reduced to that 
one by adding new unknowns and possibly differentiating once. The following result generalizes [8, 
Theorem 5.1]. Ii F = (Fi, . . . , F^) is a differentiable function bearing on r variables yi, . . . ,yr, we 
use the notation Jac(i^) for the Jacobian matrix {dFi/dyj)i<ij<r- 

Theorem 3. Let N, r be in N, let K &e a field of characteristic zero or at least N and let ip 
denote [ipi, . . . , (pr), where fi{t, y) are multivariate power series in IK[[t, yi, . . . , y^]]. 



Let L : N ^ N be such that for all s{t) in Mrxi(^[[t]]) and for all n in N, the first n terms 
of ip{t,s{t)) and of Jac{ip){t,s{t)) can be computed in L(n) operations in K. Suppose in addi- 
tion that the function n >-^ L(n)/n is increasing. Given initial conditions v in Airxii^), if the 
differential system 

y' = v{t,y), y{o) = v, 

admits a solution in 7Wrxi(IK[[t]]), then the first N terms of such a solution y{t) can be computed 
in O {\-[N) +min(MM(r, iV),r2M(iV)logiV)) operations in K. 

Werschulz [41, Theorem 3.2] gave an algorithm solving the same problem using the integral 
Volterra-type equation technique described in [34, pp. 172-173]. With our notation, his algorithm 
uses O (\-{N) + r^A^ M(A^))) operations in IC to compute a solution at precision A^. Thus, our 
algorithm is an improvement for cases where L(A^) is known to be subquadratic with respect to N . 

The best known algorithms for power series composition in r > 2 variables require, at least 
on "generic" entries, a number L(n) = 0(n''~^M(n)) of operations in IC to compute the first n 
coefficients of the composition [7, Section 3]. This complexity is nearly optimal with respect to 
the size of a generic input. By contrast, in the univariate case, the best known result [8, Th. 2.2] 
is L(n) = 0{yJn\ogn M{n)). For special entries, however, better results can be obtained, already in 
the univariate case: exponentials, logarithms, powers of univariate power series can be computed [6, 
Section 13] in L(n) = C'(M(n)). As a consequence, if ip is an r-variate sparse polynomial with 
m monomials of any degree, then L(n) = 0{mr M(n)). 

Another important class of systems with such a subquadratic L(A^) is provided by rational 
systems, where each ipi is in IK(yi, . . . ,yr-)- Supposing that the complexity of evaluation of 99 is 
bounded by L (i.e., for any point z in W at which ip is well-defined, the value ip{z) can be computed 
using at most L operations in K), then, the Baur-Strassen theorem [2] implies that the complexity 
of evaluation of the Jacobian Jac((^) is bounded by 5L, and therefore, we can take L(ri) = M(n)L 
in the statement of Theorem 3. 

1.5. Basic Complexity Notation. Our algorithms ultimately use, as a basic operation, multipli- 
cation of matrices with entries that are polynomials (or truncated power series). Thus, to estimate 
their complexities in a unified manner, we use a function MM : N x N — > N such that any two r x r 
matrices with polynomial entries in ]K[t] of degree less than d can be multiplied using MM(r, c?) 
operations in K. In particular, MM(1, d) represents the number of base field operations required to 
multiply two polynomials of degree less than d, while MM(r, 1) is the arithmetic cost of scalar r x r 
matrix multiplication. For simplicity, we denote MM(l,d) by M{d) and we have MM(r, 1) = 0{r^), 
where 2 < w < 3 is the so-called exponent of the matrix multiplication, see, e.g., [9] and [14]. 

Using the algorithms of [36, 10], one can take M{d) in 0{d\ogd\og\ogd); over fields supporting 
FFT, one can take M(d) in 0{d\ogd). By [10] we can always choose UU{r,d) in 0{r'^ M((i)), but 
better estimates are known in important particular cases. For instance, over fields of characteristic 
or larger than 2d, we have MM(r, d) = 0{r'^d + r^ M((i)), see [5, Th. 4]. To simplify the complexity 
analyses of our algorithms, we suppose that the multiplication cost function MM satisfies the 
following standard growth hypotheses for all integers di,d2 and r: 

(2) MMir,d^d,)<dlMM{r,d,) and MM(r, d,) ^ MM(r,d,) .^^^^^^^ 

In particular. Equation (2) implies the inequalities 

MM(r, 2'') + MM(r, 2''"^) + M{r, 2''-'^) + ■■■ + MM(r, 1) < 2MM(r, 2^^), 

M(2^) + 2M(2^-^) + 4M(2^-2) + . . . + 2^M(1) < (k + 1)M(2''). 

These inequalities are crucial to prove the estimates in Theorem 1 and Theorem 3. Note also that 
when the available multiplication algorithm is slower than quasi- linear (e.g., Karatsuba or naive 



multiplication), then in the second inequality, the factor (k + 1) can be replaced by a constant and 
thus the estimates M{N)\ogN in our complexities become M(A^) in those cases. 

1.6. Notation for Truncation. It is recurrent in algorithms to split a polynomial into a lower 
and a higher part. To this end, the following notation proves convenient. Given a polynomial /, 
the remainder and quotient of its Euclidean division by t'' are respectively denoted [/] and [/J^. 
Another occasional operation consists in taking a middle part out of a polynomial. To this end, we 

let [f]f, denote [/] . Furthermore, we shall write f = g mod t when two polynomials or series 

L J k 

f and g agree up to degree k — I included. To get a nice behaviour of integration with respect to 
truncation orders, all primitives of series are chosen with zero as their constant coefficient. 

2. Newton Iteration for Systems of Linear Differential Equations 

Let Y'{t) = A{t)Y{t) + B{t) be a linear differential system, where A{t) and B{t) are r x r ma- 
trices with coefficients in ]K[[i]]. Given an invertible scalar matrix Yq, an integer A^ > 1, and the 
expansions of A and B up to precision A^, we show in this section how to compute efficiently the 
power series expansion at precision N of the unique solution of the Cauchy problem 

Y'{t) = A{t)Y{t) + B{t) and Y{0) = Yq. 

This enables us to answer problems I and i, the latter being a particular case of the former (through 
the application to a companion matrix). 

2.1. Homogeneous Case. First, we design a Newton-type iteration to solve the homogeneous 
system Y' = A{t)Y. The classical Newton iteration to solve an equation (j){y) = is ^k+i = Yf^ — Uf^, 
where [/« is a solution of the linearized equation D(J)\y^ ■ U = (j){Y^) and D(J)\y^ is the differential of (j) 
at Yr. We apply this idea to the map cp : Y >-^ Y' — AY. Since (p is linear, it is its own differential 
and the equation for U becomes 

u'-au = y;,- ay,. 

Taking into account the proper orders of truncation and using Lagrange's method of variation of 
parameters [24, 20], we are thus led to the iteration 

= Y,-\U,]^^^\ 

=Y.f\Y-^f''{n-\Ar''Y, 

Thus we need to compute (approximations of) the solution Y and its inverse simultaneously. Now, 
a well-known Newton iteration for the inverse ZofYis 

(4) Z,+i = \Z, + Z,{Ir - YZ,)-] ^^^' . 

It was introduced by Schulz [37] in the case of real matrices; its version for matrices of power series 
is given for instance in [28]. 

Putting together these considerations, we arrive at the algorithm SolveHomDiffSys in Figure 1, 
whose correctness easily follows from Lemma 1 below. Remark that in the scalar case (r = 1) 
algorithm SolveHomDiffSys coincides with the algorithm for power series exponential proposed by 
Hanrot and Zimmermann [18]; see also [3]. In the case r > 1, ours is a nontrivial generaliza- 
tion of the latter. Because it takes primitives of series at precision A^, algorithm SolveHomDiff- 
Sys requires that the elements 2, 3, . . . , A^ — 1 be invertible in K. Its complexity C satisfies the 
recurrence C(m) = C(m/2) -|- 0{M{r,m)), which implies — using the growth hypotheses on M — 
that C(A) = C'(M(r, A)). This proves the first assertion of Theorem 1. 




SolveHomDiffSys(A,iV,yo) 


Input: Fo, ^o, • • • , An-2 in A^rxr-W, A = Y. Aif. 


Output: y = T.'i=Q'Yif in Mrxri'^M such that 
Y' = AY mod t^-\ and Z = y-i modt^'^. 


Z^Y,-^ 


m^2 


while m < N/2 do 

Z^Z+ Z{Ir-YZ)]"' 

Y ^Y-\y(J Z{Y' - \A] 2™-^ Y)) 


2m 


m ^- 2m 


return Y,Z 



Figure 1. Solving the Cauchy problem Y' = A{t)Y, Y{0) = Yq by Newton iteration. 



Lemma 1. Let m be an even integer. Suppose thatYfQ\,ZfQ\ in Airxr{^[t]) satisfy 



Y{o)Z(o) 



modt™/2 and y^'g) 



AY, 



(0) 



mod t 



m— 1 



and that they are of degree less than m/2 and m, respectively. Define 



Z •— \Z{0) (2-^r 



%^(o))r 



and Y := 



Y 



(0) 



Z{Y(o)-AY^o)) 



2m 



mod t 



yio)Z(o)] 



2m- 1 



Then Y and Z satisfy the equations 

(5) Ir-YZ = mod t™ and Y' - AY = 

Proof. Using the definitions of Y and Z, it follows that 

Ir-YZ= {Ir - ^(0)^(0))' -{Y- y(0))Z(0)(2/r 

Since by hypothesis Ir — Y/q\ Zim and Y — Yiq\ are zero modulo t"^/^ ^ the right-hand side is zero mod- 



mod f' 



ulo t™ and this establishes the first formula in Equation (5). Similarly, write Q = J Z(YL-. 
and observe (5 = mod t'" to get the equality 



AY, 



(0). 



Y'-AY = {I- YZ)(XL - AY^o)) - {Ylo) " AY^o))Q mod t 



2m-l 



Now, y^'o) - Ay(o) = mod i'"-\ while Q and /, 
right-hand side of the last equation is zero modulo t^m-i 



Y Z are zero modulo t™ and therefore the 
proving the last part of the lemma. D 



2.2. General Case. We want to solve the equation Y' = AY + B, where i? is an r x r matrix 
with coefficients in ]K[[t]]. Suppose that we have already computed the solution Y of the associate 
homogeneous equation Y' = AY, together with its inverse Z. Then, by the method of variation of 
parameters, Yn\ —Yf ZB is a particular solution of the inhomogeneous problem, thus the general 
solution has the form Y = y^ \ -|- Y. 

Now, to compute the particular solution Yn\ at precision N, we need to know both Y and Z 
at the same precision N. To do this, we first apply the algorithm for the homogeneous case and 
iterate (4) once. The resulting algorithm is encapsulated in Figure 2. 
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SolvelnhomDifFSys(^, B, N, Yq) 

Input: Yo,Ao,...,An-2 in Mrxr{^), A = J2At\ 

Bo,..., Bn-2 in MrxrO^), B{t) = Z Bit\ 

Output: Yi, . . . , Yat-i in Mry.r{J^) such that 
Y = Yo + Y. YiV satisfies Y' = AY + B mod t^~'^. 

Y,Z ^ SolveHomDiffSys(A, N, Yq) 



Z ^ Z + 



Z{Ir - YZ) 



Y 



YjiZB) 



N 



N 



Y ^Y + Y 
return Y 



Figure 2. Solving the Cauchy problem Y' = AY + B, Y{0) = Yq by Newton iteration. 



3. DiVIDE-AND-CONQUER ALGORITHM 

We now give our second algorithm, which allows us to solve problems ii and II and to fin- 
ish the proof of Theorem 1. Before entering a detailed presentation, let us briefly sketch the 
main idea in the particular case of a homogeneous differential equation Cy = 0, where £ is a lin- 
ear differential operator in 6 = t-^ with coefficients in I[C[[i]]. (The introduction of 6 is only for 
pedagogical reasons.) The starting remark is that if a power series y is written as yo + f^yi, 
then C{5)y = C{6)yo + t'^C{6 + m)yi. Thus, to compute a solution y of C{6)y = mod t'^"^, it 
suffices to determine the lower part of y as a solution of C{5)yo = mod t™, and then to compute 
the higher part yi, as a solution of the inhomogeneous equation C{6 + m)yi = —R mod f", where 
the rest R is computed so that C{6)yo = t^R mod i^"^. 

Our algorithm DivideAndConquer makes a recursive use of this idea. Since, during the recursions, 
we are naturally led to treat inhomogeneous equations of a slightly more general form than that 
of II we introduce the notation £{s,p,m) for the vector equation 

ty' + {pir - tA)y = s mod f^ . 

The algorithm is described in Figure 3. Choosing p = and s{t) = tb{t) we retrieve the equation of 
problem II. Our algorithm Solve to solve problem II is thus a specialization of DivideAndConquer, 
defined by making So\ve{A,b,N,v) simply call DivideAndConquer(i^, t6, 0, A^, f). Its correctness 
relies on the following immediate lemma. 

Lemma 2. Let A in Airxri^[[t]]), s in Airxi(^[[t]]), and let p,d in N. Decompose [s]"* into a 
sum So + t'^si. Suppose that yo i^ A^r-xi(]K[[t]]) satisfies the equation £{so,p,d), set R to be 

,"] m—d 

{ty'o + {pIr - tA)yo - so)/r 

and let yi in Mrxi{^[[t]]) be a solution of the equation £{si — R,p + d,m — d). Then the sum 
y '■= yo + f^yi is a solution of the equation £{s,p, m). 

The only divisions performed along our algorithm Solve are by 1, . . . , N — 1. As a consequence 
of this remark and of the previous lemma, we deduce the complexity estimates in the proposition 
below; for a general matrix A, this proves point (c) in Theorem 1, while the particular case when 
A is companion proves point (b). 
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DivideAndConquer(yl, s,p, m, v) 

Input: Ao,...,Am-i in MrxrO^), A = J2At\ 
So,... ,Sm-i,v in Mrxii^), s = ^Sif, p in K. 

Output: y = X^j^o Vi^^ ^^ ■M.rxiO^)[t] such that 
ty' + {pir - tA)y = s mod i™, y(0) = u. 
If r?i = 1 then 

if p = then 
return v 

else return p^^s(O) 
end if 
d <- [m/2\ 

yo ^- DivideAndConquer(yl, s,p, d, f) 
R^[s-ty'f^- {pIr - tA)yo]'^ 
yi ^- DivideAndConquer(^, R,p + d,m — d,v) 
return yo + f^yi 



Figure 3. Solving ty' + (plr — tA)y = s mod i™", y{0) = v, by divide-and-conquer. 

Proposition 1. Given the first m terms of the entries of A E A^,.xr(]K[[t]]) and of s £ A4rxii^[[t]]), 
given V G Airxii^), algorithm DivideAndConquer(j4, s,p, m, t") computes a solution of the linear dif- 
ferential system ty' + {plr — tA)y = s mod f", y(0) = v, using 0{r'^ \^{m) logm) operations in K. 
If A is a companion matrix, the cost reduces to C{m) = 0{r\\/\{m)\ogm). 

Proof. The correctness of the algorithm follows from the previous Lemma. The cost C(m) of the 
algorithm satisfies the recurrence 

C(m) = C([m/2j) + C([m/2])+r2M(m) + C'(rm), 

where the term r^ \J\{m) comes from the application of A to yo used to compute the rest R. 
From this recurrence, it is easy to infer that C{m) = 0(r^ M(?7i) logm). Finally, when A is a 
companion matrix, the vector R can be computed in time 0(rM(m)), which implies that in this 
case C(7Ti) = 0{rM{m)\ogm). D 

4. Faster Algorithms for Special Coefficients 

4.1. Constant Coefficients. Let A be a constant r ^ r matrix and let f be a vector of initial 
conditions. Given A^ > 1, we want to compute the first N coefficients of the series expansion of a 
solution y in 7Wrxi(]K[[t]]) of y' = Ay, with y(0) = v. In this setting, many various algorithms have 
been proposed to solve problems i, ii, I, and II, see for instance [32, 33, 21, 12, 29, 25, 26, 16, 17, 
19, 30, 27]. Again, the most naive algorithm is based on the method of undetermined coefficients. 
On the other hand, most books on differential equations, see, e.g., [20, 11, 1] recommend to simplify 
the calculations using the Jordan form of matrices. The main drawback of that approach is that 
computations are done over the algebraic closure of the base field K.. The best complexity result 
known to us is given in [27] and it is quadratic in r. 

We concentrate first on problems ii and II (computing a single solution for a single equation, 
or a first-order system). Our algorithm for problem II uses 0{r^ \ogr -\- NM{r)) operations in K 
for a general constant matrix A and only C'(AM(r)/r) operations in K in the case where ^ is a 
companion matrix (problem ii). Despite the simplicity of the solution, this is, to the best of our 
knowledge, a new result. 



In order to compute i/n = X^j^q A^vt^/il, we first compute its Laplace transform zjy = X]i=o ^*^^ 



indeed, one can switch from yjy to zn using only 0{Nr) operations in K. The vector z^v is the 
truncation at order A^ + 1 of z = X^j>o^*^^* — {^ ~ tA)~^v. As a byproduct of a more difficult 
question, [38, Prop. 10] shows that zj^ can be computed using 0{Nr^~^) operations in IC. We 
propose a solution of better complexity. 

By Cramer's rule, z is a vector of rational functions Zi{t), of degree at most r. The idea is to first 
compute z as a rational function, and then to deduce its expansion modulo t^^^. The first part of 
the algorithm does not depend on N and thus it can be seen as a precomputation. For instance, 
one can use [38, Corollary 12], to compute z in complexity C'(r'^ log r). In the second step of the 
algorithm, we have to expand r rational functions of degree at most r at precision N. Each such 
expansion can be performed using 0{NM{r)/r) operations in K, see, e.g., the proof of [4, Prop. 1]. 
The total cost of the algorithm is thus 0{r^ log r + NM(r)). We give below a simplified variant 
with same complexity, avoiding the use of the algorithm in [38] for the precomputation step and 
relying instead on a technique which is classical in the computation of minimal polynomials [9]. 

(1) Compute the vectors v, Av, A^v, A'^v, . . . , A^^v in 0{r^ log r), as follows: 
for K from 1 to 1 + log r do 

(a) compute A^ 

(b) compute A^" x [v\Av\ ■ ■ ■ [A^^'^u], thus getting [A^^-ylA^^+^ul • • • \A'^'"^^~'^v] 

(2) For each j = 1, . . . , r: 

(a) recover the rational fraction whose series expansion is Yl iA'^v)jt'^ by Pade approxima- 
tion in 0(M(r) logr) operations 

(b) compute its expansion up to precision t "'"^ in 0{N M{r)/r) operations 

(c) recover the expansion of y from that of z, using 0{N) operations. 

This yields the announced total cost of 0{r'^ logr + NM{r)) operations for problem II. 

We now turn to the estimation of the cost for problems i and I (bases of solutions). In the 
case of equations with constant coefficients, we use the Laplace transform again. If y = X]j>o Vit^ 
is a solution of an order r equation with constant coefficients, then the sequence (zj) = {i\yi) is 
generated by a linear recurrence with constant coefficients. Hence, the first terms zi,...,zn can 
be computed in 0(iVM(r)/r) operations, using again the algorithm described in [4, Prop. 1]. For 
problem I, the exponent w + 1 in the cost of the precomputation can be reduced to w by a very 
different approach; we cannot give the details here for space limitation. 

4.2. Polynomial Coefficients. If the coefficients in one of the problems i, ii, I, and II are 

polynomials in ]K[t] of degree at most d, using the linear recurrence of order d satisfied by the 
coefficients of the solution seemingly yields the lowest possible complexity. Consider for instance 
problem II. Plugging A = Y.i=o t'A, b = Yli=o **^«' ^^^ ^ = Si>0 *'?/« ™ ^^^ equation y' = Ay + b, 
we arrive at the following recurrence 

yk+d+i = {d + k+ l)~'^{Adyk + Ad^iyk+i H h Aoyk+d + bk+d), for ah k > -d. 

Thus, to compute yo, . . . ,y]\f, we need to perform Nd matrix-vector products; this is done us- 
ing 0{dNr'^) operations in K. A similar analysis implies the other complexity estimates in the 
third column of Table 1. 

5. Non-linear Systems of Differential Equations 

Let (p{t,y) = {ifi{t,y), . . . ,ifrit,y)), where each ipi is a power series in ]K[[t, yi, . . . ,yr]]- We 
consider the first-order non-linear system in y 

{M) y[{t) = ipi{t,yi{t),...,yr{t)), ..., y^t) = Mt,yi{t), . . . ,yr{t)). 
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To solve (M), we use the classical technique of linearization. The idea is to attach, to an 
approximate solution yo of (■^)) ^ tangent system in the new unknown z, 

[T, yo) z' = Jac{ip){yo)z - y'o + 9?(yo), 

which is linear and whose solutions serve to obtain a better approximation of a true solution 
of (AA). Indeed, let us denote by {Mm),{%n) the systems (AA),(T) where all the equalities are 
taken modulo t"^. Taylor's formula states that the expansion ip{y + z) — ip{y) — Ja.c{if){y)z is 
equal to modulo z^. It is a simple matter to check that if y is a solution of (Mm) and if 2; is a 
solution of ('72m, y), then y + z is a solution of {J\f2m)- This justifies the correctness of Algorithm 
SolveNonLinearSys. 

To analyze the complexity of this algorithm, it suffices to remark that for each integer k between 
1 and [log A^J , one has to compute one solution of a linear inhomogeneous first-order system at 
precision 2" and to evaluate 99 and its Jacobian on a series at the same precision. This concludes 
the proof of Theorem 3. 





SolveNonLinearSys(0, v) 






Input: 


N in N, ip{t, y) in 


lC[[i,yi,...,yr]] 


'', f in 


W 


Output: first N terms 
that y{t)' = ip{t,y{t)) mo 


of a y{t) in 
d t^ and y(0) = 


V. 


such 


m ^ 1 










y ^v 
while m 

A^ 
b^ 


< N/2 do 
- \3ac{^){y)Y^ 

W{y)-yT' 

Solve(A,6,27n,0) 








y*- 


y + z 








m ^ 


-2m 








return y 











Figure 4. Solving the non-linear differential system y' = ip{t,y), y(0) = v. 



6. Implementation and Timings 

We implemented our algorithms SolveDiffHomSys and Solve in Magma [31] and ran the programs 
on an Athlon processor at 2.2 GHz with 2 GB of memory. We used Magma's built-in polynomial 
arithmetic (using successively naive, Karatsuba and FFT multiplication algorithms), as well as 
Magma's scalar matrix multiplication (of cubic complexity in the ranges of our interest). We 
give three tables of timings. First, we compare in Figure 5 the performances of our algorithm 
SolveDiffHomSys with that of the naive quadratic algorithm, for computing a basis of (truncated 
power series) solutions of a homogeneous system. The order of the system varies from 2 to 16, 
while the precision required for the solution varies from 256 to 4096; the base field is Z/pZ, where 
p is a 32-bit prime. 

Then we display in Figure 6 and Figure 7 the timings obtained respectively with algorithm Solve- 
DiffHomSys and with the algorithm for polynomial matrix multiplication PolyMatMul that was 



All the computations have been done on the machines of the MEDICIS ressource center 
http : //medicis . polytechnique . f r . 
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N'-.r 


2 


4 


8 


16 


256 


0.02 vs. 2.09 


0.08 vs. 6.11 


0.44 vs. 28.16 


2.859 vs. 168.96 


512 


0.04 vs. 8.12 


0.17 vs. 25.35 


0.989 vs. 113.65 


6.41 vs. 688.52 


1024 


0.08 vs. 32.18 


0.39 vs. 104.26 


2.30 vs. 484.16 


15 vs. 2795.71 


2048 


0.18 vs. 128.48 


0.94 vs. 424.65 


5.54 vs. 2025.68 


36.62 vs. > 3hours * 


4096 


0.42 vs. 503.6 


2.26 vs. 1686.42 


13.69 vs. 8348.03 


92.11 vs. > 1/2 day* 



Figure 5. Computation of a basis of a linear homogeneous system with r equations, 
at precision A^: comparison of timings (in seconds) between algorithm SolveDiffHom- 
Sys and the naive algorithm. Entries marked with a '•' are estimated timings. 



used as a primitive of SolveDifFHomSys. The similar shapes of the two surfaces indicate that 
the complexity prediction of point (a) in Theorem 1 is well respected in our implementation: 
SolveDifFHomSys uses a constant number (between 4 and 5) of polynomial multiplications; note 
that the abrupt jumps at powers of 2 reflect the performance of Magma's FFT implementation of 
polynomial arithmetic. 



time {in seconds) 




time (in seconds) 




Figure 6. Timings of algo- 
rithm PolyMatMul. 



Figure 7. Timings of algo- 
rithm SolveDiffHomSys. 



In Figure 8 we give the timings for the computation of one solution of a linear differential 
equation of order 2, 4, and 8, respectively, using our algorithm Solve in Section 3. Again, the shape 
of the three curves experimentally confirms the nearly linear behaviour established in point (b) 
of Theorem 1, both in the precision N and in the order r of the complexity of algorithm Solve. 
Finally, Figure 9 displays the three curves from Figure 8 together with the timings curve for the 
naive quadratic algorithm computing one solution of a linear differential equation of order 2. The 
conclusion is that our algorithm Solve becomes very early superior to the quadratic one. 

We also implemented our algorithms of Section 4.1 for the special case of constant coefficients. 
For reasons of space limitation, we only provide a few experimental results for problem II. Over 
the same finite field, we computed: a solution of a linear system with r = 8 at precision A^ ~ 10^ 
in 24.53s; one at doubled precision in doubled time 49.05s; one for doubled order r = 16 in doubled 
time 49.79s. 
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Figure 8. Timings of algo- 
rithm Solve for equations of 
orders 2, 4, and 8. 



Figure 9. Same, compared 
to the naive algorithm for a 
second-order equation. 
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